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Autism-associated SHANK3 missense point mutations impact conformational fluctuations and protein turnover at synapses 
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Detailed protocol 


Required software and sources 


All required software is freely available: 


e Preparation of mutant structures (R12C and L68P): UCSF Chimera (Pettersen et al., 2004) 
e MD simulations using the Amber03 force-field: GROMACS (Duan et al., 2003; Hess et al., 2008; Pronk et al., 2013) 
© Gromacs is freely available here: gromacs.org 
o Recommended online tutorial: http:/Avww.mdtutorials.com/gmx/ 
e For data analysis and plots: Python and its modules numpy (Harris et al. 2020), scipy (Virtanen et al. 2020), matplotlib (Hunter 2007). 
e Visualization of structures and data analysis: VMD 1.9.3. (Humphrey et al., 1996) 


Included files 


e MD parameter (mdp) files: these files are needed to prepare the run input files for each MD step using "gmx grompp" (adapted from 
http:/Avwww.mdtutorials.com/gmx/) 
e Starting structures: 
o Wild-type structure 5G4X and corresponding protein-only structure (without waters and additives) 
o Mutant structures: L68P and R12C (created with Chimera) 
e TCL script for VMD used for RMSF calculations: 
° See also https:/Awww.ks.uiuc.edu/Research/vmd/ 


Preparation of starting structures 


As starting structure for the wild-type system we have used the pdb entry 5G4X from https://www.rcsb.org/ 

The mutants R12C and L68P were generated by replacing the respective residues in UCSF Chimera using the Dunbrack 2010 backbone-dependent rotamer 
library (Shapovalov & Dunbrack, 2011). Alternatively, other programs such as Pymol can be used for this step. Individual side chains can be replaced in 
Chimera using the Rotamers tool (Tools... Structure Editing). Thereby individual side chain rotamers can be selected based on their predicted probabilities 
and can be incorporated into the corresponding protein structure. This, however, does not modify the backbone and may initially produce clashes or steric 
repulsions such that the generated mutants require subsequent minimization. 


Preparation of MD systems 


The following instructions are for the wild type (WT) system. The mutant systems can be prepared using the same protocol. The only difference is the charge of 
the protein system, i.e. the number of ions that are required to neutralize the system. 


Preparation of the MD simulation box 


e Create a folder named WT and copy the mdp input files (folder in attachment) and 5G4X.pdb (from protein database) into this folder 
e Clean pdb by removing all lines starting with HETATM from 5G4X.pdb using a text editor: name the new pdb 5G4X_onlyprot.pdb 
e Create Gromacs coordinate file prot.gro from pdb 
© gmx pdb2gmx -f 5G4X_onlyprot.pdb -o prot.gro 
= Choose Amber03 as force field (here number 1) and TIP3P as explicit water model (here number 1) 
= Total charge -1 (for R12C this will be -2) 


= Note: Depending on the protonation criteria of the program used, different charges may be obtained. 
e Place structure in box with 1 nm distance of the protein to each box face 
© gmx editconf -f prot.gro -o box.gro -c -d 1 -bt cubic 
e Fill box with water molecules 
© gmx solvate -cp box.gro -cs spc216.gro -p topol.top -o solv.gro 
e Add ions to neutralize charge: In this case we need to add 1 Na* ion 
° First create the input file with grompp. There will be one warning about the net charge being not zero. We can ignore this one since we will add 
counterions in the second step 
= gmx grompp -f ions.mdp -c solv.gro -p topol.top -o ions.tpr -maxwarn 1 
© Nowrun the actual ion placement job choosing the “SOL” group as solvent atoms (here number 13) 
= gmx genion -s ions.tpr -o ions.gro -p topol.top -pname NA -nname CL -np 1 
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Fig 1: Solvation and charge neutralization of an N-terminal fragment of SHANK3 

(A) The X-ray structure of the N-terminal fragment of SHANK3 including the SPN- and ARR-domain is shown (PDB: 5G4X; Lilja et al., 2017). Both domains ar 
connected by a flexible linker. (B) After preparation of the MD system, the protein is solvated in a cubic box and charge neutralized by the addition of a Na* ion. 
Images were created using VMD 1.9.3. 


Equilibration of the MD simulation box 


© Setup and run minimization. The output files will have the prefix em. One may (optionally, requires CUDA) use the GPU for nonbonded interactions with 
the flag “-nb gpu’, specify the number of threads used with the -nt flag and show more information during the run with -v. In case you run your simulation on 
a node/computer that is shared, the -pin on option avoids pinning threads from different mdrun instances to the same core. These optional parameters 
apply to all following mdrun commands. 
© gmx grompp -f minim.mdp -c ions.gro -p topol.top -o em.tpr 
© gmxmdrun -v -nb gpu -nt 8 -pin on -deffnm em 
e Prepare and run 100 ps of NVT equilibration 
© gmx grompp -f nvt.mdp -c em.gro -p topol.top -o nvt.tpr -r em.gro 
© gmx mdrun -deffnm nvt 
e Prepare and run 100 ps of NPT equilibration 
© gmx grompp -f npt.mdp -c nvt.gro -t nvt.cpt -p topol.top -o npt.tpr -r nvt.gro 
© gmx mdrun -deffnm npt 
e Now your system is ready for the production run 


Preparation of production run 


e Setup and runa 1000 ns MD simulation (this was done ona high performance computer) 
© gmx grompp -f md_long.mdp -c npt.gro -t npt.cpt -p topol.top -o md_long.tpr 
© gmx mdrun -deffnm md_long -v -pin on -nt 8 -nb gpu 
e Create separate folders for the mutants R12C and L68P and repeat all the aforementioned steps. 
© Note: R12C will have a different charge which has to be taken into account in the neutralization step (cf. Preparation of the MD simulation box) 


Processing of MD simulation prior to analysis 


Before analyzing the trajectory it is necessary to center the protein molecule in the box (remove jumps in the structure from one side of the box to the other) and 
to remove rotation and translation (optional) in order to visually inspect the trajectory in VMD. We recommend this prior to analyzing the RMSF and RMSD 
shown in the next section. 


© Create a trajectory with the protein centered (based on the C-alpha atoms). For our analysis we only need the protein coordinates, which saves space. 
Therefore we run the following command with the C-alpha group (typically 3) for centering and the protein group (typically 1) as output 


© gmx trjconv -pbc mol -center -s md_long.tpr -f md_long.xtc -o md_long_cent.xtc 
e Create a centered structure that can be used later with VMD. Again we choose the C-alpha group (3) for centering and the protein group (1) for output 
using the following command: 
© gmx trjconv -f md_long.tpr -pbc mol -center -o md_long_rottrans.gro -s md_long.tpr 
e The final step of processing the trajectory is to remove rotation and translation of the protein. For the least squares fit we use the C-alpha group (3) and 
for output the protein group (1): 
© gmx trjconv -s md_long.tpr -f md_long_cent.xtc -o md_long_rottrans.xtc -fit rot+trans 


Analysis of MD simulations 


RMSF analysis 


RMSF analysis can be performed in multiple ways either using gromacs or an external software such as VMD 1.9.3. Subsequently, RMSF analysis with VMD is 
illustrated. First load the centered trajectory without rotation and translation into VMD using “prot.gro” to define the molecule and “md_long_rottrans.xtc” to 
define the trajectory. For RMSF and RMSD analysis, every fifth frame was loaded (Stride: 5). To perform RMSF analysis, copy the file “RMSF.TCL” (provider 
as a supplementary file) to the VMD installation directory. By default, this is /usr/local/lib/vmd on Unix systems, and C:\Program Files\University of Illinois\V MD 
on Windows systems. Within this file, adjust the frame numbers in the actual RMSF measurement command depending on your total number of frames: 


e set rmsf [measure rmsf $sel first 0 last 20002 step 1] 
Save the adjusted file and run the script from the Tk console within VMD using the command: 
e source rmsf.tcl 


A text file named “RMSF.ttt” will be saved in the same working directory containing two columns. The first column represents the protein residue number and 
the second column gives the measured RMSF values in Angstrom. The data can subsequently be plotted with any graphing software. 


Alternative RMSF calculation using gromacs tools 


To perform RMSF analysis with gromacs first create an index file: 
e gmx make_ndx -f md_long_rottrans 


We will then use the trajectory to calculate the RMSF for the protein using the C-alpha group (usually selection 3). The -res flag will result in an output of RMSF 
vs. residue number (instead of atom number): 


© gmxrmsf -f md_long_rottrans.xtc -s md_long_rottrans.gro -n index.ndx -res 


The created output file rmsf.xvg contains two columns: the first one are the residue numbers, the second one are the RMSF values. This data can be loaded 
and visualized with a program of your choice. One quick way to visualize it is with xmgrace (needs to be installed): 


¢ xmgrace rmsf.xvg 


RMSD analysis 


Within VMD, RMSD analysis can be performed using the RMSD Trajectory Tool (Extensions > Analysis > RMSD Trajectory Tool). Therefore, the same 
trajectory is used as for the RMSF analysis (md_long_rottrans.xtc; Stride: 5). Within the RMSD Trajectory Tool make sure to define the selected atoms as 
“protein”. To analyze the RMSD only for a part of the protein (e.g. an individual domain), specify the corresponding residues instead (e.g. “residue 1 to 92”). 

Then specify the following parameters: 


e Selection modifiers = backbone 
e Reference mol = Top 
e Trajectory ref frame = 1 


Subsequently the measured RMSD values can be saved as separate file as plotted with any graphing software. 
Alternative RMSD calculation using gromacs tools 


For obtaining the RMSD using the gromacs tool rms use the following command: 


® gmxrms -s md_long.tor -f md_long_rottrans.xtc -o rmsd.xvg -tu ns 
e¢ Choose C_alpha both for the least-squares fit and the subsequent RMSD calculation 


The created output file rmsf.xvg contains two columns: the first one contains the simulation time in ns, the second one are the RMSD values (in nm). Similarly to 
the RMSF values, the RMSD can for instance be visualized using xmgrace: 


© xmgrace rmsd.xvg 
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